INTRODUCTION

The  principal  uncertainties  in  clearing  a  minefield  are  about  the  number  and  type  of 
mines  that  have  not  yet  been  removed.  In  this  paper,  we  take  the  point  of  view  that  the 
number  of  mines  should  be  thought  of  as  a  random  variable  M,  and  our  main  object  is  to 
show  the  advantages  of  a  particular  “Katz”  class  of  probability  distributions  for  M.  We 
assume throughout that the result of clearance effort is to remove each mine with a
known probability p, the “clearance level”, independently of all other mines. If p = 1, we
have the case of exhaustive clearance. A theory is hardly required when clearance is
exhaustive, and the methods outlined below will be of no use, so we assume p < 1.

Barring  the  possibility  of  exhaustive  clearance,  any  cleared  minefield  will  retain  some 
residual  risk  to  transitors,  and  quantifying  that  riskiness  should  be  one  of  the  main  goals 
of  theory.  Depending  on  the  assessment  of  risk,  it  may  be  advisable  to  either  continue 
clearance  or  declare  that  the  minefield  is  sufficiently  cleared  that  the  risk  to  transitors  is 
bearable.  In  simplest  terms  this  risk  is  measured  by  Simple  Initial  Threat  (SIT),  the 
probability  that  the  first  minefield  transitor  will  be  killed  or  damaged  by  a  mine.  Since 
SIT  depends  strongly  on  the  number  of  mines  remaining,  which  in  turn  depends  on  the 
number M that were there initially, it is hard to imagine how a basically subtractive
clearance activity could result in sufficient knowledge about M to support a computation
of SIT, unless some information about M is input initially. Such input is in fact required.
According to Bayes' theorem, the required information must take the form of a probability
distribution for M. Katz distributions are a two-parameter class of probability
distributions  that  is  broad  enough  to  be  practically  useful,  while  simultaneously  being 
narrow  enough  to  permit  a  significant  theory. 

Katz distributions have potential applications in areas other than minefield clearance.
In considering the reliability of software, for example, one might begin by supposing that
there are an unknown number of bugs, M, some of which are discovered and removed in
the process of using and simultaneously improving the software (Jelinski and Moranda,
1972). In other applications M might represent ore pockets, oil deposits, schools of fish
or unexploded ordnance. Nonetheless, the language of minefield clearance will be used
exclusively  below. 

The next major section describes Katz distributions, utilizing several subsections and
referring  to  appendices  for  proofs  of  theorems. 

KATZ  DISTRIBUTIONS  AND  THEIR  USES  IN  MODELING  MINEFIELDS 
Generalities 

Suppose that a region contains an unknown number of mines, M, that an action is
taken to find and remove the mines, and that Y mines are in fact removed by the action.
The number of mines that remain is X = M - Y. It is X that determines the threat of the
minefield to subsequent transitors. Even though Y is known, X is not known exactly
because M was not known in the first place. Still, the nature of the clearance action
taken, together with Y, may provide useful information about X through an application of
Bayes' theorem. If the initial distribution of M is Katz, then the distribution of X will be
of the same type, a feature that enables a multi-stage approach to minefield clearance
because the Katz output of one stage can be the Katz input to the next. Katz distributions
also  have  some  other  appealing  properties,  so  there  are  good  reasons  to  begin  a  minefield 
clearance  analysis  by  assuming  a  Katz  distribution  for  M. 

Definition  and  basic  properties 

Katz (1965) describes a probability distribution $x_0, x_1, \ldots$ with the property that

$$\frac{x_{j+1}}{x_j} = \frac{\alpha + \beta j}{1 + j}, \qquad j = 0, 1, \ldots \tag{1}$$

The distribution (1) will be referred to as a “Katz distribution with parameters $\alpha$ and $\beta$”,
provided $\alpha$ and $\beta$ meet certain restrictions. We will use the notation $M \sim K(\alpha, \beta)$ to
express this compactly, with the $\sim$ symbol standing for “is distributed as”. Given $x_0$,
equation (1) sequentially determines $x_1, x_2, \ldots$. Since the sum $x_0 + x_1 + \ldots$ must be 1, $x_0$
is determined implicitly.
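
Recursion (1) can be turned into a small routine for tabulating a Katz distribution. The Python sketch below is an illustration only: the name `katz_pmf` and the truncation length `n_terms` are assumptions introduced here, not part of Katz (1965). It builds unnormalised weights from (1) and then rescales them to sum to 1, which is how the first probability is fixed implicitly.

```python
def katz_pmf(alpha, beta, n_terms=200):
    """Tabulate x_0 .. x_{n_terms-1} of K(alpha, beta) from recursion (1).

    Assumes alpha > 0, beta < 1 and, when beta < 0, that -alpha/beta is an
    integer. The tail beyond n_terms is dropped before normalising, so the
    result is approximate when the support is infinite (beta >= 0).
    """
    weights = [1.0]  # arbitrary starting value for x_0; rescaled below
    for j in range(n_terms - 1):
        ratio = (alpha + beta * j) / (1 + j)
        # For beta < 0 the ratio reaches 0 at j = -alpha/beta and the support
        # ends there; max() guards against round-off just below zero.
        weights.append(weights[-1] * max(ratio, 0.0))
    total = sum(weights)
    return [w / total for w in weights]


poisson = katz_pmf(2.0, 0.0)      # beta = 0
binomial = katz_pmf(2.0, -0.5)    # beta < 0, with -alpha/beta = 4
```

In the second call the weights vanish from the fifth term on, so the tabulated distribution has support 0 to 4.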

The parameter $\alpha$ must be nonnegative, since it is the ratio $x_1/x_0$, and $\beta$ must be less
than 1 to enforce convergence to 0 for large j. If $\beta < 0$, then (1) will eventually produce
negative probabilities unless $-\alpha/\beta$ is an integer. To prevent this possibility, $-\alpha/\beta$ is
required to be an integer when $\beta$ is negative. The restrictions on parameters are thus that

$$\alpha > 0, \quad \beta < 1, \quad \text{and } -\alpha/\beta \text{ is an integer when } \beta < 0. \tag{2}$$

Let the generating function be $g(z; \alpha, \beta) = \sum_{j=0}^{\infty} x_j z^j$. Katz (1965) showed that

$$g(z; \alpha, \beta) = \left[\frac{1 - \beta}{1 - \beta z}\right]^{\alpha/\beta} \tag{3}$$

with (3) being interpreted as $\exp(\alpha(z - 1))$ (the limit as $\beta$ approaches 0) if $\beta = 0$. It follows
that the initial probability must be

$$x_0 = g(0; \alpha, \beta) = (1 - \beta)^{\alpha/\beta} \tag{4}$$

or $x_0 = \exp(-\alpha)$ if $\beta = 0$. If $M \sim K(\alpha, \beta)$ then (3) implies that

$$E(M) = \mu = \alpha/(1 - \beta) \quad \text{and} \quad \mathrm{Var}(M) = \sigma^2 = \alpha/(1 - \beta)^2. \tag{5}$$


It is not hard to establish that a Katz distribution is

• if $\beta < 0$, a binomial distribution with $-\alpha/\beta$ trials and success probability $\beta/(\beta - 1)$,

• if $\beta = 0$, a Poisson distribution with mean $\alpha$, or

• if $\beta > 0$, a negative binomial distribution. If $\alpha/\beta$ is an integer, this is the
distribution of the number of failures until the $(\alpha/\beta)$th success in a sequence of
independent trials where the failure probability is $\beta$. However, the “number of
successes” $\alpha/\beta$ can actually be any positive real number.

The Katz class includes no other distribution, so it can be thought of as the union of three
familiar types.

Since the mean and variance are more familiar parameters than $\alpha$ and $\beta$, the solution
of (5) for $\alpha$ and $\beta$ in terms of $\mu$ and $\sigma^2$ may be useful:

$$\beta = 1 - \mu/\sigma^2 \quad \text{and} \quad \alpha = \mu^2/\sigma^2. \tag{6}$$
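
The conversion between the two parameterisations is easy to mechanise. The following is a minimal Python sketch under stated assumptions: the function names `katz_moments` and `katz_from_moments` and the tolerance `tol` are hypothetical choices made here, and the integrality check simply enforces restriction (2) numerically.

```python
def katz_moments(alpha, beta):
    """Mean and variance of K(alpha, beta), equation (5)."""
    mean = alpha / (1.0 - beta)
    return mean, mean / (1.0 - beta)


def katz_from_moments(mean, variance, tol=1e-9):
    """Invert (5) via (6); reject moment pairs that violate restriction (2)."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    beta = 1.0 - mean / variance
    alpha = mean * mean / variance
    if beta < 0:
        trials = -alpha / beta  # number of binomial trials; must be an integer
        if abs(trials - round(trials)) > tol:
            raise ValueError("-alpha/beta is not an integer for beta < 0")
    return alpha, beta


# A mean of 5 with variance 50/9 (variance larger than the mean).
alpha, beta = katz_from_moments(5.0, 50.0 / 9.0)
```

Any pair with variance larger than the mean is accepted; pairs with variance smaller than the mean pass only when they correspond to a whole number of binomial trials.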

Always $\mu > 0$ and $\sigma^2 > 0$ in (6), but some nonnegative $(\mu, \sigma^2)$ pairs are impossible
because of the restriction that $-\alpha/\beta$ must be an integer when $\beta$ is negative. This
restriction is not imposed by Katz (1965), who simply zeros all probabilities after (and
including) the first that (1) would make negative. Unfortunately, this tactic falsifies
equations (3) - (6). For example, suppose $\alpha = 1$ and $\beta = -2$. Then (1) has $x_1/x_0 = 1$ and
$x_2/x_1 = -1/2$, so Katz would take $x_0 = x_1 = 1/2$, $x_i = 0$ for $i \ge 2$. The mean of this
distribution is $\mu = 1/2$, not 1/3 as would be obtained by (5). The fact that (3) - (6) are
false when $\beta < 0$ and $-\alpha/\beta$ is not an integer is not recognized in Katz (1965), nor in
subsequent restatements such as Johnson and Kotz (1969).
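
The counterexample can be checked numerically. The sketch below is illustrative; `katz_truncated_pmf` is a name invented here for the truncation tactic described above, not a routine from Katz (1965).

```python
def katz_truncated_pmf(alpha, beta, n_terms=50):
    """Apply recursion (1), zeroing every term from the first negative one on."""
    weights = [1.0]
    for j in range(n_terms - 1):
        ratio = (alpha + beta * j) / (1 + j)
        if ratio < 0 or weights[-1] == 0.0:
            weights.append(0.0)
        else:
            weights.append(weights[-1] * ratio)
    total = sum(weights)
    return [w / total for w in weights]


pmf = katz_truncated_pmf(1.0, -2.0)
truncated_mean = sum(j * x for j, x in enumerate(pmf))   # mean of the truncated law
formula_mean = 1.0 / (1.0 - (-2.0))                      # alpha / (1 - beta) from (5)
```

The two means differ (1/2 against 1/3), which is the discrepancy the text attributes to dropping the integer restriction.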

Since $\beta = 1 - \mu/\sigma^2$, all $(\mu, \sigma^2)$ pairs where $0 < \mu < \sigma^2$ are possible. This covers
situations  where  there  is  great  uncertainty  about  the  number  of  mines  present,  as  is 
typically  the  case  in  minefield  clearance. 

The case where $\alpha = \beta = 1$ is a “noninformative prior” in the sense that the ratio $x_{j+1}/x_j$
is 1 for all $j \ge 0$. It is a limiting case of the negative binomial where all nonnegative
numbers are equally likely and $\mu$ approaches infinity. It is not a true distribution because
all of the probabilities $x_j$ must approach 0, but may nonetheless serve as a prior
distribution for operations such as the one described next.

The  Sample-Observe-Subtract  (SOS)  Property 

The main property that makes Katz distributions useful in minefield clearance is that
the class is closed under SOS operations. Formally,

Theorem 1: Let M be the number of mines, suppose $M \sim K(\alpha, \beta)$, let Y be the number
of mines removed when each mine is removed with known probability p, independently
of the others, and let X = M - Y be the number of mines remaining (not removed). Then,
conditional on the event (Y = y) being given, $X \sim K(\alpha', \beta')$, where $\alpha'$ and $\beta'$ are given
by (7) with q = 1 - p.

$$\alpha' = q(\alpha + \beta y), \qquad \beta' = q\beta. \tag{7}$$

A proof can be found in Appendix A. The same proof can also be found in Washburn
(1996), as can proofs of other theorems.
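
Because the update (7) maps Katz parameters to Katz parameters, successive clearance stages can be chained. The Python sketch below is a hedged illustration: `sos_update` is a name introduced here, and the numerical inputs are arbitrary.

```python
def sos_update(alpha, beta, p, y):
    """Posterior Katz parameters for X = M - Y given Y = y, equation (7)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("clearance level p must satisfy 0 <= p < 1")
    q = 1.0 - p
    return q * (alpha + beta * y), q * beta


# Two successive clearance stages: the output of one is the input to the next.
stage1 = sos_update(4.5, 0.1, p=0.5, y=3)
stage2 = sos_update(*stage1, p=0.5, y=1)

# One combined stage with q = 0.5 * 0.5 and the same total number removed.
combined = sos_update(4.5, 0.1, p=0.75, y=4)
```

In this sketch the two-stage result coincides with the single combined stage, as substituting (7) into itself suggests.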

Glazebrook and Boys (1995) introduce a larger class of distributions that is still
closed under the SOS operation. Binomial distributions are generalized to “light tailed”
distributions, negative binomial distributions are generalized to “heavy tailed”
distributions, and the Poisson distribution continues to play its central role. The Katz
class can be regarded as a two-parameter subset with additional, convenient analytic
properties.

Theorem  1  resolves  a  eertain  minefield  paradox.  Suppose  that  a  minefield  is  cleared 
to  the  .5  level,  and  that  Y  mines  are  removed  in  the  process.  One  might  argue  that  Y  mines 
must  remain,  since  only  half  have  been  removed.  But  how  can  it  be  that  the  number 
estimated to remain should increase with the number cleared, since clearance is by its
nature subtractive? The paradox disappears when one realizes that clearance to a known
level provides both evidence and removal. When $\beta > 0$, the evidence part dominates and
the estimated number remaining does indeed increase with the number removed. When
$\beta < 0$, the removal part dominates. In the Poisson case $\beta = 0$, the number removed does
not affect the distribution of the number that remain.
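
The three regimes can be seen by tabulating the posterior mean of the number remaining against the number removed. This is a sketch under assumptions: `posterior_mean_remaining` is a hypothetical helper that combines (7) with the mean formula (5), and the parameter values are chosen here only for illustration, with clearance to the 0.5 level.

```python
def posterior_mean_remaining(alpha, beta, p, y):
    """E(X | Y = y): apply (7), then the mean formula (5) to K(alpha', beta')."""
    q = 1.0 - p
    alpha_post, beta_post = q * (alpha + beta * y), q * beta
    return alpha_post / (1.0 - beta_post)


cases = {
    "negative binomial (beta > 0)": (4.0, 0.5),
    "Poisson (beta = 0)": (4.0, 0.0),
    "binomial (beta < 0)": (2.0, -0.5),   # -alpha/beta = 4 mines at most
}
means = {
    name: [posterior_mean_remaining(a, b, 0.5, y) for y in range(5)]
    for name, (a, b) in cases.items()
}
```

The first list increases with y, the second is constant, and the third decreases to zero once all four possible mines have been removed.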

Since clearance is a process carried out in time, it is likely that clearance times
$T_1, \ldots, T_y$ will also be known when (Y = y) is observed. If the magnitudes of these times
influence  the  posterior  distribution  of  M,  then  the  clearance  times,  as  well  as  the  number 
of  mines  cleared,  should  be  accounted  for.  However,  there  is  no  effect  of  this  kind  as  long 
as  the  clearance  level  p  is  known,  regardless  of  the  initial  distribution  of  the  number  of 
mines.  The  proof  of  this  statement  can  be  found  in  Theorem  2  of  Appendix  A,  and  its 
corollary. 

Simple  Sampling  From  a  Katz  Distribution 

Theorem  1  governs  the  case  where  Y,  the  number  of  mines  removed,  is  observed. 
There  are  also  circumstances  where  Y  is  not  observed.  One  example  is  where  M  is  the 
number of mines in region S, but only some fraction q of S (call it S', possibly a transit
channel) is of concern. If q is interpreted to be the probability that any given mine will be
in S', then the number of mines X in S' is the number remaining after sampling M at the
level q, but without observing the results of the sample. Another example is in minefield
clearance where the clearance plan is the subject of analysis. Since Y has yet to be
observed,  any  forecast  of  residual  threat  cannot  be  based  on  Y.  Theorem  3  states  that  X  is 
still  Katz,  even  when  Y  is  not  given. 

Theorem 3: Let M be the number of mines, suppose $M \sim K(\alpha, \beta)$, and let X be the
number of mines in the sample when each mine is included with probability q,
independently of the others. Then $X \sim K(\alpha', \beta')$, where $\alpha'$ and $\beta'$ are given by (8) with
p = 1 - q.

$$\alpha' = \frac{\alpha q}{1 - \beta p}, \qquad \beta' = \frac{\beta q}{1 - \beta p}. \tag{8}$$

A proof can be found in Appendix A. Of course, the number of mines Y removed from M
is also Katz, but with p and q reversed in (8).
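
Equation (8) and its inverse are short enough to sketch directly. In the Python below, `thin` implements (8) as stated; `unthin` is an algebraic inversion derived here rather than taken from the paper, and both names are assumptions.

```python
def thin(alpha, beta, q):
    """Katz parameters of the sample when each mine is kept with probability q, (8)."""
    p = 1.0 - q
    denom = 1.0 - beta * p
    return alpha * q / denom, beta * q / denom


def unthin(alpha_s, beta_s, q):
    """Solve (8) for (alpha, beta) given the sample's parameters; assumes q > 0."""
    p = 1.0 - q
    beta = beta_s / (q + beta_s * p)
    alpha = alpha_s * (1.0 - beta * p) / q
    return alpha, beta


# Parameters for a whole region whose 10% sub-region is K(4.5, 0.1).
whole_area = unthin(4.5, 0.1, q=0.1)
channel = thin(*whole_area, q=0.1)

mean_whole = whole_area[0] / (1.0 - whole_area[1])
mean_channel = channel[0] / (1.0 - channel[1])
```

A useful sanity check on any implementation is that the mean scales by q: here `mean_channel` equals 0.1 times `mean_whole`.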


Simple  Initial  Threat  (SIT)  for  a  Katz  Distribution 

Uncertainty about the number of mines implies uncertainty about whether the
minefield is safe for a transitor to cross. The simplest quantification is to define the
parameter

t = probability that a given uncleared mine kills the transitor, (9)

and then assume that all mines act independently. For example, suppose that mines are
distributed uniformly and independently in a minefield with width W, and that each mine
actuates with probability B if the transitor's straight line path takes it to within A/2 of the
mine. Assume also that the transitor will be killed with probability D, conditional on
actuation. Then, as long as $W \gg A$ and the transitor's path is near the center of the
minefield (ignoring edge effects, in other words), the parameter t is ABD/W. However, t
does  not  need  to  be  calculated  in  that  way  -  the  calculation  could  involve  actuation 
curves,  navigation  errors,  and  edge  effects  as  in  Odle  (1977). 

The  transitor  is  assumed  to  encounter  the  mines  one  at  a  time.  As  long  as  the  transitor 
survives,  the  probability  that  the  next  mine  kills  it  is  by  assumption  t,  independently  of 
any others. The probability that all M mines fail to kill the transitor is therefore $(1 - t)^M$,
and the probability that the first transitor to enter the minefield is killed is the Simple
Initial Threat (SIT):

$$\mathrm{SIT} = 1 - E\left((1 - t)^M\right). \tag{10}$$

If $M \sim K(\alpha, \beta)$, equation (10) can be evaluated by substituting 1 - t for z in (3), obtaining

$$\mathrm{SIT} = 1 - g(1 - t; \alpha, \beta). \tag{11}$$

If clearance is carried out and Y observed before the transitor enters the minefield, then
$\alpha'$ and $\beta'$ from (7) should be substituted for $\alpha$ and $\beta$ in (11). If Y has not been observed,
then $\alpha'$ and $\beta'$ from equation (8) should be used instead.
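
A direct evaluation of (11) needs only the generating function (3). The sketch below is illustrative; the function names `katz_pgf` and `simple_initial_threat` are assumptions introduced here, and the Poisson limit is handled as a separate branch.

```python
import math


def katz_pgf(z, alpha, beta):
    """Generating function (3), with the Poisson limit used when beta = 0."""
    if beta == 0.0:
        return math.exp(alpha * (z - 1.0))
    return ((1.0 - beta) / (1.0 - beta * z)) ** (alpha / beta)


def simple_initial_threat(t, alpha, beta):
    """SIT from (11): one minus the chance that every mine fails to kill."""
    return 1.0 - katz_pgf(1.0 - t, alpha, beta)


sit_nb = simple_initial_threat(0.1, 4.5, 0.1)       # negative binomial, mean 5
sit_poisson = simple_initial_threat(0.1, 5.0, 0.0)  # Poisson, mean 5
```

To forecast SIT after clearance, the same function would be called with the primed parameters from (7) or (8) in place of the originals.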

The method of forecasting SIT in current Navy tactical decision aids such as
NUCEVL and UCPLN (Wagner, et al, 1999) is based on (7) and (11) with the
aforementioned “noninformative prior” for M, essentially a limiting Katz distribution
where $\beta = 1$ and $\alpha = 1$. This distribution is “conservative” in the sense that E(M) is infinite,
but such conservatism can have unexpected implications. For example, a minefield
cleared to a very low level, with no mines found, would be assessed to have an SIT of
nearly 1.
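
The behaviour described here can be reproduced by feeding the limiting prior through (7) and (11). The following sketch is an illustration built for this purpose, not a reconstruction of NUCEVL or UCPLN; the function name `sit_noninformative` and the value t = 0.1 are assumptions.

```python
def sit_noninformative(p, y, t):
    """SIT after clearing to level p > 0 and finding y mines, from alpha = beta = 1."""
    q = 1.0 - p
    alpha_post, beta_post = q * (1.0 + y), q                      # equation (7)
    base = (1.0 - beta_post) / (1.0 - beta_post * (1.0 - t))      # bracket in (3)
    return 1.0 - base ** (alpha_post / beta_post)                 # equation (11)


# No mines found, at progressively lower clearance levels.
threats = [sit_noninformative(p, y=0, t=0.1) for p in (0.5, 0.1, 0.01, 0.001)]
```

With no mines found, the assessed threat in this sketch rises toward 1 as the clearance level falls toward 0.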

Threat  to  Transitors  after  the  First 

The  second  and  following  transitors  are  much  harder  to  deal  with  analytically  than 
the  first.  Odle  (1977)  gives  formulas  for  several  multi-transitor  measures,  but  derivation 
is non-trivial even when the number of mines is known. An exception is the “catastrophic
failure” probability $c_n$, the probability that none of n transitors is sunk, a concept and
term that were introduced by Horrigan (1973). Let $Q_n$ be the catastrophe probability for
a single mine. Then $c_n$ is simply $Q_n^M$ for M independent mines. If $M \sim K(\alpha, \beta)$, then the
catastrophe probability is

$$c_n = E\left(Q_n^M\right) = g(Q_n; \alpha, \beta), \tag{12}$$

where g( ) is again the generating function given by (3). Odle (1977) gives the formula
when M is Poisson, a special case. As in the case of SIT, the important thing is that the
generating function of M be known.

The single-mine catastrophe probability $Q_n$ would be $(1 - t)^n$ if each transitor's track
were chosen independently of the others, but multiple transitors are usually assumed to
attempt to follow the same track. In that case the correct “configured” computation of $Q_n$
can become a significant task in itself, particularly if navigation errors are involved, but
the degree of difficulty has nothing to do with the distribution of the number of mines
present. Regardless of the method used for computing or measuring $Q_n$, (12) generalizes
from one mine to a Katz distributed number of mines.
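
Equation (12) separates the two tasks cleanly: any method may supply the single-mine value, and the generating function does the rest. The sketch below uses the independent-track formula only as a stand-in for that single-mine computation; the function names are hypothetical and the parameter values are illustrative.

```python
import math


def katz_pgf(z, alpha, beta):
    """Generating function (3), with the Poisson limit used when beta = 0."""
    if beta == 0.0:
        return math.exp(alpha * (z - 1.0))
    return ((1.0 - beta) / (1.0 - beta * z)) ** (alpha / beta)


def q_independent_tracks(t, n):
    """Q_n = (1 - t)**n, valid only if each transitor picks its track independently."""
    return (1.0 - t) ** n


def catastrophe_probability(q_n, alpha, beta):
    """c_n from (12), given the single-mine value Q_n from any source."""
    return katz_pgf(q_n, alpha, beta)


c = [catastrophe_probability(q_independent_tracks(0.1, n), 4.5, 0.1)
     for n in range(1, 6)]
```

A “configured” computation for transitors following a common track would replace `q_independent_tracks` and leave `catastrophe_probability` unchanged.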

There appear to be no simple, closed-form formulas other than (12) when multiple
transitors are considered, even when the number of mines is known. There are practical
methods for calculating the casualty distribution and other statistical measures (Odle,
1977), but the methods do not simplify when the number of mines has a Katz (or even a
Poisson) distribution.

A simple upper bound on $E_n$, the expected number of casualties out of n transitors,
can be obtained by observing that the number of casualties cannot exceed M, and
therefore that $E_n$ cannot exceed E(M). If each mine causes a casualty with probability at
most D whenever it detonates, then a better bound is

$$E_n \le D \times E(M). \tag{13}$$

If $M \sim K(\alpha, \beta)$, then E(M) is given by (5). Since $E_n$ is necessarily a nondecreasing
function of n, (13) is sharpest for large values of n. Of course, $E_1 = \mathrm{SIT}$.

Sums  and  Partitions  of  Katz  Random  Variables 

Suppose there are n independent mine populations $M_i$, with $M_i \sim K(\alpha_i, \beta_i)$,
$i = 1, \ldots, n$. Let the total number of mines be $M = M_1 + \ldots + M_n$, and let the clearance
level for the i-th population be $p_i$. As usual, take $q_i = 1 - p_i$. Let the number of type i
mines cleared be $Y_i$, with $Y = Y_1 + \ldots + Y_n$, and let the number remaining be $X_i$, with
$X = X_1 + \ldots + X_n$. Of course $X_i + Y_i = M_i$ and X + Y = M. These mine populations might be
different kinds of mines in one minefield, the numbers of mines in different minefields,
or any other partition of M into n parts. Several questions arise about such mixed
minefields:

•  Does  M  have  a  Katz  distribution? 

• If all of the $Y_i$ are observed, does X have a Katz distribution?

• If only Y is observed, without knowing the mine types, does $X_i$ have a Katz
distribution? 

Theorem 4 and its corollaries in Appendix A deal with these questions. The answers
have a tendency to be a discouraging no, the exceptions being when all of the $\beta_i$ are
equal, or better yet when they are all 0; i.e., when all populations are Poisson.

The answer to the important third question is yes if Y = 0, since the observation that
Y = 0 is equivalent to the observation that $Y_i = 0$ for all i. The answer is also yes if p = 1,
since in that case $X_i = 0$ for all i. One might hope that the answer would still be yes even
if p < 1 and Y > 0, provided $\beta_i = \beta$ for all i, since the latter condition is sufficient for Y to
be Katz (Corollary 1 to Theorem 4). Unfortunately, this is not true. Washburn (1996)
shows that conditional independence fails unless $\beta = 0$. This is further evidence that the
Poisson case is an especially convenient assumption about the initial number of mines.

If  the  total  number  of  mines  in  a  minefield  is  Katz,  then  can  the  total  be  easily 
partitioned  into  several  independent  component  Katz  distributions?  This  question  might 
arise  because  mines  can  be  of  different  types,  or  because  mines  in  different  parts  of  a 
minefield  receive  unequal  clearance  effort.  The  situation  is  similar  to  that  with 
summation - the only useful theoretical results are in the case where all of the $\beta_i$ are
equal, especially if they are equal to 0. Theorem 5 of Appendix A summarizes what is
known.

AN EXAMPLE

Suppose it is known that approximately 50 mines have been placed in an area with
dimensions 5 km long by 2 km wide. Lacking any information to the contrary, the mines
are supposed to be all of the same known type, and to be scattered uniformly over the
area. It is necessary to clear a channel through the area, but the channel needs to be only
200 m wide, so only about 10% of the mines should be expected in the channel. The top
graph in Figure 1 shows the Katz distribution selected for the initial distribution of M,
with $\alpha = 4.5$ and $\beta = 0.1$. It is of the negative binomial type, with a mean of 5 and, by (5), a
standard deviation of 2.36 mines. If (8) is solved for $(\alpha, \beta)$ with $\alpha' = 4.5$, $\beta' = 0.1$, and q = 0.1,
the solution is (23.68, 0.5263). This Katz distribution for the number of mines in the
whole area has a mean of 50 and a standard deviation of 10.27. However, only the mines
in the channel are of concern.

We suppose that each mine has a sweepwidth of 20 m against transitors, within which
damage is certain. Since the channel is 200 m wide, this corresponds to a threat from
each mine of t = 0.1. The corresponding threat from the mines in the channel, if unswept, is
SIT = 1 - g(0.9; 4.5, 0.1) = 0.392 (formula (11)). Suppose it is desired to reduce this threat to 0.1
by clearing the minefield to some level p. The effect of this sweeping is to change $(\alpha, \beta)$
to $(\alpha', \beta')$ according to (8), with q = 1 - p, and with (11) subsequently predicting SIT.
The required clearance level turns out to be 0.789. This clearance level must be achieved
by selecting the number of tracks and runs of each track appropriately, considering the
nature of the mines and the sweeping forces. A tactical decision aid such as UCPLN
(Wagner, et al, 1999) might be used in planning how to achieve the required clearance
level.

Suppose that 10 mines are removed in the process of effecting the clearance plan, a
surprisingly large number, given the prior distribution. Given this additional information,
the residual threat of the minefield is no longer 0.1. It can be determined by first using
equation (7), with $(\alpha, \beta) = (4.5, 0.1)$ and y = 10, to compute $(\alpha', \beta') = (1.1605, 0.0211)$. The
corresponding posterior distribution is shown in the lower part of Figure 1. The posterior
SIT from (11) is 0.1112, larger than 0.1 because $\beta > 0$ and a large number of mines were
found  (it  is  even  conceivable  that  the  minefield  would  be  more  threatening  after 
clearance  than  before,  although  the  number  of  mines  found  would  have  to  be  very  large 
for  that  to  happen).  If  a  threat  of  0.1112  is  still  felt  to  be  too  large,  then  the  clearance 
process  must  be  continued  until  SIT  is  sufficiently  small. 
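
The planning and updating steps of this example can be retraced in a few lines. The Python sketch below is not the Excel workbook mentioned in the text; the function names and the use of bisection to find the clearance level are assumptions made here for illustration.

```python
def pgf(z, alpha, beta):
    """Generating function (3); beta is nonzero throughout this example."""
    return ((1.0 - beta) / (1.0 - beta * z)) ** (alpha / beta)


def planned_sit(p, alpha, beta, t):
    """Forecast SIT after clearance to level p, before Y is observed: (8) then (11)."""
    q = 1.0 - p
    denom = 1.0 - beta * p
    return 1.0 - pgf(1.0 - t, alpha * q / denom, beta * q / denom)


def required_clearance(target, alpha, beta, t, tol=1e-10):
    """Bisection on p; relies on planned_sit decreasing in p on [0, 1)."""
    lo, hi = 0.0, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if planned_sit(mid, alpha, beta, t) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha, beta, t = 4.5, 0.1, 0.1
sit_unswept = 1.0 - pgf(1.0 - t, alpha, beta)
p_needed = required_clearance(0.1, alpha, beta, t)

# Posterior after 10 mines are found at the rounded level p = 0.789, equation (7).
q = 1.0 - 0.789
alpha_post, beta_post = q * (alpha + beta * 10), q * beta
sit_posterior = 1.0 - pgf(1.0 - t, alpha_post, beta_post)
```

Under these assumptions the sketch gives an unswept SIT near 0.39, a required clearance level near 0.789, and a posterior SIT of about 0.11, somewhat above the planned 0.1.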

The  clearance  process  outlined  above  is  sequential  in  nature,  with  the  need  for  further 
clearance  depending  on  the  results  of  clearance  to  date.  This  is  a  realistic  feature,  since  it 
is  characteristic  of  minefield  clearance  that  the  nature  of  the  minefield  is  determined  in 
the  process  of  clearing  it. 

This  example  has  been  taken  from  some  tutorial  notes  on  mine  warfare  models  that 
can  be  downloaded  (Washburn,  2005),  if  desired,  along  with  an  accompanying  Excel™ 
workbook  (Washburn,  2005).  Sheet  “Katz”  of  that  workbook  incorporates  the  formulas 
required to make the above computations.

Figure 1: Prior and posterior (y = 10) Katz distributions for the example.

BAYESIAN  METHODS,  KATZ  DISTRIBUTIONS,  AND  DECISION  AIDS 

As  a  long-time  teacher  of  Decision  Theory  at  the  Naval  Postgraduate  School,  I  can 
assure  my  readers  that  it  is  not  easy  to  convince  humans  to  quantify  uncertainty  using 
probability.  We  are  reluctant  to  do  it,  even  in  circumstances  of  far  less  importance  than 
clearing  a  minefield.  Bayes  Theorem  is  not  intuitive,  and  posterior  distributions  are 
sometimes  seen  as  having  more  to  do  with  magic  than  with  logic.  Given  these  human 
tendencies,  the  nature  of  current  tactical  decision  aids  for  minefield  clearance  is 
understandable.  The  user  is  never  asked  to  quantify  expectations  about  the  most 
important  parameter  of  a  minefield  —  the  number  of  mines  that  are  present.  As  a  result, 
the effectiveness of clearance must be discussed in terms of clearance level p, rather than
the more natural quantity SIT. I have encountered Navy officers who think of the two
quantities as opposites; that is, that SIT must be 0.1 if p = 0.9. This, too, is natural, albeit
wrong, since it is natural to expect decision aids to output quantities that are directly
relevant to decision making. But SIT and p are not opposites. Probability theory,
correctly applied, is simply unable to determine SIT or any other measure of threat
without making some assumption about the number of mines initially present. One can,
of course, make a noninformative prior assumption for the initial number of mines, but
this is simply getting the user out of the loop by utilizing a (pessimistic) default
assumption. One way or another, an assumption is required.

Even if one accepts the idea that the number of mines M must be thought of as a
random variable with a prior distribution, it does not necessarily follow that M should be
forced to be Katz. The Katz class is closed under some important operations such as SOS,
but not under all of them. Particularly when multiple mine types are present, it is
possible to make reasonable observations that result in joint posterior distributions not
being Katz, or (worse yet) even independent, even when all of the prior distributions are
Katz. The Katz class may not be large enough.

Why not permit general distributions for M? A general distribution would require
storing 1000 numbers if the maximum conceivable number of mines were 999. A Katz
distribution requires only 2, but performing a Bayesian update on a general distribution
would nonetheless be trivial with a modern computer. In a different context,
NODESTAR (Stone and Corwin, 1995) performs Bayesian updates with 10^ states,
rather than only 10^. Using a general distribution would also have the advantage that any
observation with a known conditional probability law could be the basis of a Bayesian
update, which is not true in the Katz case. There seem to be some good arguments for
removing all restrictions on the nature of the prior distribution. The idea of using general
distributions does not become computationally unwieldy until multiple random variables
must be described jointly. If there were for example five mine types, the number of each
of which does not exceed 999, then there would be $10^{15}$ joint possibilities. Today's
computers cannot perform Bayesian updates on that scale, nor will tomorrow's be able to
do so. Manipulating general distributions over two or three mine types is currently
feasible, and potentially useful, but dealing with 5 mine types is not.
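
For a single mine type, the general update is indeed only a few lines. The sketch below is an illustration under stated assumptions: `general_update` and `katz_pmf` are names introduced here, the prior is stored as a plain list, and the likelihood is the binomial removal model used throughout the paper. When the prior happens to be Katz, the result can be compared with the closed form (7).

```python
from math import comb


def katz_pmf(alpha, beta, n_terms):
    """Truncated Katz probabilities from recursion (1), renormalised to sum to 1."""
    weights = [1.0]
    for j in range(n_terms - 1):
        weights.append(weights[-1] * max((alpha + beta * j) / (1 + j), 0.0))
    total = sum(weights)
    return [w / total for w in weights]


def general_update(prior, p, y):
    """Posterior of X = M - Y given Y = y for an arbitrary prior[m] = Pr(M = m).

    Each mine is removed independently with probability p, so the likelihood
    of Y = y given M = y + j is binomial.
    """
    q = 1.0 - p
    weights = [comb(y + j, y) * p**y * q**j * prior[y + j]
               for j in range(len(prior) - y)]
    total = sum(weights)
    return [w / total for w in weights]


prior = katz_pmf(4.5, 0.1, 120)
posterior = general_update(prior, p=0.789, y=10)
closed_form = katz_pmf(0.211 * (4.5 + 0.1 * 10), 0.211 * 0.1, 110)   # equation (7)
max_gap = max(abs(a - b) for a, b in zip(posterior, closed_form))
```

The same `general_update` accepts any prior vector, which is the flexibility the text refers to; only the storage grows, from 2 numbers to one per possible mine count.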

The difficulty with Katz distributions when multiple mine types are present is that
there are reasonable observations that destroy the Katz property, in which case one might
as well have started with general distributions in the first place. These difficulties largely
disappear if all initial mine distributions are independent and Poisson, the special case
where $\beta = 0$. The decision aid COGNIT (McCurdy, 1987) assumes this. Unfortunately,
Poisson distributions have only a single parameter, the mean number of mines, which
means that the standard deviation is not independently controllable: a Poisson
distribution with mean 100 necessarily has a standard deviation of (only) 10. Inclusion of
distributions with $\beta > 0$ (negative binomial distributions) in the permitted class seems
essential to model the large uncertainty about mine numbers that is to be expected in
practice.

Among  the  three  increasingly  general  classes  of  distributions  (Poisson,  Katz,  and 
general),  there  are  thus  serious  practical  objections  that  can  be  made  to  each.  A  mine 
clearance  decision  aid  developer  should  therefore  consider  the  distribution  class  question 
carefully.  Here  are  some  further  observations  in  favor  of  the  Katz  choice. 

Although  general  distributions  for  the  number  of  mines  are  in  most  cases 
computationally  feasible,  it  is  also  true  that  little  is  lost  by  restricting  input  distributions 
to  be  of  the  Katz  type.  In  fact,  the  Katz  restriction  may  be  operationally  welcome,  since 
the  entire  distribution  is  determined  from  only  two  estimated  numbers.  With  these 
thoughts in mind, one prototype TDA (MIXER) proposed by the author (Washburn,
1995) employs Katz distributions exclusively, requiring the user to quantify uncertainty
by providing a mean and standard deviation for each mine type. MIXER relies entirely on
input files and keyboard responses from the user. Paes (2001) describes a more
user-friendly version, Javamix, with a GUI and some graphics.

The Katz divisibility properties described in Theorem 4 could also prove handy. If a
region containing M mines must be divided into two physical parts for two independent
clearance operations, then Theorem 4 describes how the numbers of mines in the two
parts can be independent, Katz, and still sum to M. The comparable operation in the
general case may be difficult or impossible.

Perhaps most important, the availability of an analytic expression for SIT in the Katz
case opens up the possibility (as in MIXER) of posing the mathematical problem of
minimizing SIT, subject to constraints on the clearance effort, a computational problem
that would be much more difficult in the general case. When a variety of clearance
resources are available to deal with a variety of mine types, a decision aid capable of
producing the “best” clearance plan should be operationally welcome.

In  summary, 

•  Given the central importance of uncertainty about M in minefield clearance
analysis, there are some good arguments for taking a Bayesian approach.

•  In a Bayesian analysis, Katz distributions are a natural class of probability
distributions for the prior distribution of M.

APPENDIX A (Theorems)

The theorems in this appendix are all taken from Washburn (1996), reproduced here for
convenience.

Theorem 1 states that Katz distributions are closed under the Sample-Observe-Subtract
operation, the fundamental property that makes them useful in mine clearance.


Theorem 1: Let $M$ be the number of mines, suppose $M \sim K(\alpha, \beta)$, let $Y$ be the
number of mines removed when each mine is removed with probability $p$, independently of
the others, and let $X = M - Y$ be the number of mines remaining (not removed). Then,
conditional on the event $(Y = y)$ being given, $X \sim K(\alpha', \beta')$, where $\alpha'$ and
$\beta'$ are given by (A4) below with $q = 1 - p$.


Proof: Let $x_j = \Pr(M = j)$ and $x_j^* = \Pr(X = j \mid Y = y)$; $j = 0, 1, \ldots$. Then

$$
\begin{aligned}
x_j^* \Pr(Y = y) &= \Pr(Y = y \cap X = j) \\
&= \Pr(Y = y \cap M = y + j) \\
&= \Pr(Y = y \mid M = y + j)\,\Pr(M = y + j).
\end{aligned}
\tag{A1}
$$

But $\Pr(Y = y \mid M = y + j)$ is the binomial probability of $y$ successes in $y + j$ trials,
so, letting $q = 1 - p$,

$$
x_j^* \Pr(Y = y) = \binom{y + j}{y} p^y q^j x_{y+j}; \quad j = 0, 1, \ldots
\tag{A2}
$$

Taking the ratio of successive terms in (A2), the factor $\Pr(Y = y)$ cancels and

$$
\frac{x_{j+1}^*}{x_j^*} = \left\{ \frac{y + j + 1}{j + 1} \right\} q \left\{ \frac{\alpha + \beta(y + j)}{y + j + 1} \right\}.
\tag{A3}
$$

The first { } factor in (A3) is a ratio of combinatorial coefficients, and the second is by
assumption $x_{y+j+1}/x_{y+j}$. The two $(y + j + 1)$ factors in (A3) cancel, so (A3) is again a
linear function of $j$ divided by $j + 1$, as was to be shown. If $\alpha$ and $\beta$ satisfy (2),
it is easy to check that the same is true of the revised parameters $\alpha'$ and $\beta'$, where

$$
\alpha' = q(\alpha + \beta y), \qquad \beta' = q\beta.
\tag{A4}
$$

This concludes the proof of Theorem 1. □
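
The update in Theorem 1 can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the Katz probabilities are generated from the recurrence $x_{j+1}/x_j = (\alpha + \beta j)/(j + 1)$ and normalized over a finite truncation, which is an approximation that is harmless when the tail is negligible.

```python
from math import comb


def katz_pmf(alpha, beta, n_terms):
    """First n_terms Katz probabilities from the recurrence, normalized over the truncation."""
    weights = [1.0]
    for j in range(n_terms - 1):
        nxt = weights[-1] * (alpha + beta * j) / (j + 1)
        weights.append(max(nxt, 0.0))  # clamp: terms vanish past the support when beta < 0
    total = sum(weights)
    return [w / total for w in weights]


def katz_posterior(alpha, beta, p, y):
    """Parameters of the residual count X given Y = y mines removed, per (A4)."""
    q = 1.0 - p
    return q * (alpha + beta * y), q * beta


def residual_pmf_bruteforce(alpha, beta, p, y, n_terms):
    """Pr(X = j | Y = y) computed directly from (A2), without using (A4)."""
    q = 1.0 - p
    prior = katz_pmf(alpha, beta, n_terms + y)
    weights = [comb(y + j, y) * p ** y * q ** j * prior[y + j] for j in range(n_terms)]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, with α = 4, β = 0.6, p = 0.7 and y = 3, `katz_posterior` returns parameters close to (1.74, 0.18), and `katz_pmf` evaluated at those parameters should agree with `residual_pmf_bruteforce` to within rounding error.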


When the number of mines cleared ($Y$) is observed, it is likely that the clearance times
$U_1, \ldots, U_Y$ will also be observed. These times turn out to have no additional value in
making inferences about the initial number of mines $M$, whether or not $M$ has a Katz
distribution, and therefore no value for the residual number of mines $M - Y$. This result
may seem counterintuitive. If one searches for 24 hours, finding 5 mines in the first hour
and none thereafter, then intuition argues that there are probably no remaining mines,
whereas there might be more mines if the clearance times were scattered over the whole
clearance period. This intuition might be correct if the probability law $F(\,)$ governing the
detection times were unknown, since there is information about $F(\,)$ in the clearance
times. If $F(\,)$ is known, however (as it must be if the clearance level is calculable), then
the corollary to Theorem 2 states that the clearance times are useless.

Theorem 2: Let $M$ be a nonnegative random variable, and let $T_1, \ldots, T_M$ be
independent, continuous random variables with common distribution function $F(\,)$. Let $t$
be any real number, let $I_i$ indicate the event $(T_i \le t)$, let $Y = I_1 + \cdots + I_M$, and let
$\mathbf{U} = (U_1, \ldots, U_Y)$, where $U_1, \ldots, U_Y$ are the nondecreasing order statistics of those
$T_i$ for which $T_i \le t$. If $m$ and $y$ are nonnegative integers for which $0 \le y \le m$, and if
$\mathbf{u} = (u_1, \ldots, u_y)$ is a real vector, then either $\Pr(Y = y, M = m) = 0$, or

$$
\Pr(\mathbf{U} = \mathbf{u} \mid Y = y, M = m) = \Pr(\mathbf{U} = \mathbf{u} \mid Y = y).
\tag{A5}
$$

Proof: Both sides of (A5) are well defined if $\Pr(Y = y, M = m) > 0$. Furthermore,
both are 0 unless $u_1 \le u_2 \le \cdots \le u_y \le t$, so suppose that those conditions hold.
Define the event

$$
E_m = (M = m) \cap \bigcap_{i=1}^{y} (T_i = u_i) \cap \bigcap_{i=y+1}^{m} (T_i > t).
\tag{A6}
$$

Then, since the random variables $T_i$ are all independent by assumption,

$$
\Pr(E_m) = \Pr(M = m) \left[ \prod_{i=1}^{y} dF(u_i) \right] [1 - F(t)]^{m-y}.
\tag{A7}
$$

The event $(\mathbf{U} = \mathbf{u}) \cap (M = m)$ includes $E_m$ and other mutually exclusive events that
have the same probability, since the first $y$ of the $T_i$ are not necessarily the smallest. The
number of these events is $y!\binom{m}{y}$, the number of permutations of $m$ things taken $y$ at a
time. Thus

$$
\Pr(\mathbf{U} = \mathbf{u}, M = m) = y! \binom{m}{y} \Pr(E_m).
\tag{A8}
$$

Since $\Pr(Y = y, M = m) = \Pr(M = m) \binom{m}{y} F(t)^y [1 - F(t)]^{m-y}$, it is a simple matter to take
the ratio $\Pr(\mathbf{U} = \mathbf{u}, M = m)/\Pr(Y = y, M = m)$ to obtain

$$
\Pr(\mathbf{U} = \mathbf{u} \mid Y = y, M = m) = y! \prod_{i=1}^{y} \left[ dF(u_i)/F(t) \right].
\tag{A9}
$$

But the right hand side of (A9) does not depend on $m$, so it must also be $\Pr(\mathbf{U} = \mathbf{u} \mid Y = y)$.

In other words, conditional on $(Y = y)$ being given, the order statistics $\mathbf{U}$ are distributed as
if they were the order statistics of the truncated distribution $F(\,)/F(t)$, sampled $y$ times. □


Corollary: If both sides of (A9) are defined,

$$
\Pr(M = m \mid \mathbf{U} = \mathbf{u}, Y = y) = \Pr(M = m \mid Y = y).
$$

Proof: Apply Bayes theorem, recognizing that the right hand side of (A9) does not
depend on $m$. □


According  to  the  corollary,  once  the  number  of  mines  found  is  known,  there  is  no 
additional  value  in  observing  the  times  at  which  they  were  found. 
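
The corollary can be illustrated with a small exact calculation. The sketch below assumes, purely for illustration, an exponential detection-time law $F(u) = 1 - e^{-\lambda u}$ and an arbitrary finite prior on $M$; neither assumption comes from the source, and the function names are hypothetical. The posterior for $M$ computed from the full likelihood (A7)-(A8) is compared with the posterior computed from the count alone.

```python
from math import comb, exp, factorial


def exponential_cdf(t, rate):
    """Assumed detection-time law F(t) = 1 - exp(-rate * t)."""
    return 1.0 - exp(-rate * t)


def posterior_counts_only(prior, y, t, rate):
    """Pr(M = m | Y = y), using only the binomial likelihood of the count."""
    f_t = exponential_cdf(t, rate)
    weights = [pm * comb(m, y) * f_t ** y * (1.0 - f_t) ** (m - y) if m >= y else 0.0
               for m, pm in enumerate(prior)]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_with_times(prior, times, t, rate):
    """Pr(M = m | U = u, Y = y), using the likelihood in (A7)-(A8)."""
    y = len(times)
    f_t = exponential_cdf(t, rate)
    density = 1.0
    for u in times:
        density *= rate * exp(-rate * u)  # density f(u_i) stands in for dF(u_i)
    weights = [pm * factorial(y) * comb(m, y) * density * (1.0 - f_t) ** (m - y)
               if m >= y else 0.0
               for m, pm in enumerate(prior)]
    total = sum(weights)
    return [w / total for w in weights]
```

Calling `posterior_with_times` with three early times and again with three scattered times over the same period gives the same posterior as `posterior_counts_only` with y = 3, because the factor involving the times does not depend on $m$ and cancels on normalization.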

Theorem 3 states that the residual number of mines is Katz, even if the number of mines
removed is not observed.


Theorem 3: Let $M$ be the number of mines, suppose $M \sim K(\alpha, \beta)$, and let $X$ be the
number of mines in the sample when each mine is included with probability $q$,
independently of the others. Then $X \sim K(\alpha', \beta')$, where $\alpha'$ and $\beta'$ are given by (A13)
with $p = 1 - q$.


Proof: Since $X$ is binomial when $M$ is given,

$$
E(z^X) = \sum_{j=0}^{\infty} x_j \sum_{i=0}^{j} \binom{j}{i} q^i p^{j-i} z^i
\tag{A10}
$$

$$
= \sum_{j=0}^{\infty} x_j (qz + p)^j
\tag{A11}
$$

$$
= g(qz + p; \alpha, \beta),
\tag{A12}
$$

where $g(\,)$ is the generating function (3). Equation (A11) is obtained from (A10) by
combining the factors $q^i$ and $z^i$, and then employing the Binomial Theorem. Equation
(A12) is obtained from (A11) by recalling the definition of the generating function $g(\,)$.
After rearranging (A12), $X$ can be shown to be Katz with parameters

$$
\alpha' = \frac{\alpha q}{1 - p\beta}, \qquad \beta' = \frac{q\beta}{1 - p\beta}.
\tag{A13}
$$

If $\alpha$ and $\beta$ satisfy (2), then so do $\alpha'$ and $\beta'$. □
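
A numerical check of Theorem 3, offered as a sketch with hypothetical function names: the thinned parameters from (A13) are compared with a brute-force binomial thinning of a truncated Katz distribution. The truncation length is an arbitrary choice that must be long enough for the neglected tail to be negligible.

```python
from math import comb


def katz_pmf(alpha, beta, n_terms):
    """First n_terms Katz probabilities from the recurrence, normalized over the truncation."""
    weights = [1.0]
    for j in range(n_terms - 1):
        nxt = weights[-1] * (alpha + beta * j) / (j + 1)
        weights.append(max(nxt, 0.0))  # clamp: terms vanish past the support when beta < 0
    total = sum(weights)
    return [w / total for w in weights]


def katz_thin(alpha, beta, q):
    """Parameters of X when each of M ~ K(alpha, beta) mines is kept with probability q, per (A13)."""
    p = 1.0 - q
    denom = 1.0 - p * beta
    return alpha * q / denom, q * beta / denom


def thinned_pmf_bruteforce(alpha, beta, q, n_out, n_prior):
    """Pr(X = i) for i < n_out by summing binomial probabilities over the prior, as in (A10)."""
    p = 1.0 - q
    prior = katz_pmf(alpha, beta, n_prior)
    return [sum(prior[j] * comb(j, i) * q ** i * p ** (j - i) for j in range(i, n_prior))
            for i in range(n_out)]
```

With α = 4, β = 0.6 and q = 0.3, the mean of the thinned distribution, α'/(1 − β'), comes out as 0.3 times the original mean α/(1 − β), as thinning requires.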


Theorem 4 and its corollaries deal with multiple mine types.

Theorem 4: Assume $n$ independent mine populations $M_i$, with $M_i \sim K(\alpha_i, \beta_i)$,
$i = 1, \ldots, n$. The total number of mines is $M = M_1 + \cdots + M_n$. The clearance level for the
$i$th population is $p_i$, with $q_i = 1 - p_i$. The number of type $i$ mines cleared is $Y_i$, with
$Y = Y_1 + \cdots + Y_n$, and the number remaining is $X_i$, with $X = X_1 + \cdots + X_n$. If $\beta_i = \beta$
for all $i$, then $M \sim K(\alpha, \beta)$, where $\alpha = \alpha_1 + \cdots + \alpha_n$. Otherwise, $M$ does not have
a Katz distribution.

Proof: Since the $M_i$ are all independent, the generating function of $M$ is the
product of the individual generating functions (Johnson and Kotz, op. cit., p. 21)

$$
g(z) = \prod_{i=1}^{n} \left[ (1 - \beta_i)/(1 - \beta_i z) \right]^{\alpha_i/\beta_i}.
\tag{A14}
$$

If $\beta_i = \beta$ for all $i$, then (A14) reduces to $g(z) = [(1 - \beta)/(1 - \beta z)]^{\alpha/\beta}$, the generating
function of a Katz random variable. Otherwise, (A14) does not have the required form
and $M$ is therefore not Katz. □


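Corollary 1: If $q_i\beta_i = \beta$, $i = 1, \ldots, n$, and if $Y_i$ is observed for $i = 1, \ldots, n$, then

$$
X \sim K\left( \sum_{i=1}^{n} (\alpha_i q_i + \beta Y_i),\ \beta \right).
$$

Proof: According to (A4), $X_i \sim K(\alpha_i q_i + \beta Y_i,\ q_i\beta_i)$ when $Y_i$ is given. Since
$q_i\beta_i = \beta$, the conclusion that $X$ has a Katz distribution then follows from Theorem 4. □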

Corollary 2: Suppose $q_i = (1/\beta_i - 1)/(1/\beta - 1)$ for some parameter $\beta$, $i = 1, \ldots, n$. Then
$X \sim K(\alpha_{TOT}, \beta)$, where

$$
\alpha_{TOT} = \sum_{i=1}^{n} \alpha_i\,\beta/\beta_i.
\tag{A15}
$$

Alternatively, if $p_i = (1/\beta_i - 1)/(1/\beta - 1)$ for $i = 1, \ldots, n$, then $Y \sim K(\alpha_{TOT}, \beta)$. If $\beta_i = 0$ for
all $i$, then take $\beta/\beta_i$ to be $q_i$ (for $X$) or $p_i$ (for $Y$) in (A15).

Proof: The condition on $q_i$ enforces $\beta_i' = \beta$ and $\alpha_i' = \alpha_i\,\beta/\beta_i$ in (A13), which applies
when the number of mines cleared is not observed. The conclusion then follows from
Theorem 4. If the condition on $p_i$ holds, then the same logic applies to $Y$, the number of
mines removed. □
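
Corollary 2 can be read as a recipe for choosing clearance levels so that the total residual count stays Katz. The following sketch, with hypothetical function names, computes the required $q_i$ for a target common parameter $\beta$ and obtains the total α by summing the per-type parameters from (A13). It assumes every $\beta_i > 0$ and $0 < \beta \le \beta_i$, so that each $q_i$ is a valid probability.

```python
def katz_thin(alpha, beta, q):
    """Per-type residual parameters from (A13), with p = 1 - q."""
    p = 1.0 - q
    denom = 1.0 - p * beta
    return alpha * q / denom, q * beta / denom


def common_beta_clearance(alphas, betas, beta):
    """Return (q_list, alpha_tot) such that every residual type is K(alpha_i', beta).

    Assumes 0 < beta <= beta_i < 1 for every type, so that 0 < q_i <= 1.
    """
    q_list, alpha_tot = [], 0.0
    for a_i, b_i in zip(alphas, betas):
        q_i = (1.0 / b_i - 1.0) / (1.0 / beta - 1.0)
        if not 0.0 < q_i <= 1.0:
            raise ValueError("target beta is not reachable for beta_i = %r" % b_i)
        a_new, b_new = katz_thin(a_i, b_i, q_i)
        assert abs(b_new - beta) < 1e-9  # every type now shares the common beta
        q_list.append(q_i)
        alpha_tot += a_new
    return q_list, alpha_tot
```

For two mine types with (α, β) equal to (3, 0.6) and (2, 0.5) and a target β = 0.4, the sketch gives retention probabilities 4/9 and 2/3 and a total α of 3.6, which equals the sum of $\alpha_i\beta/\beta_i$.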


Theorem 5 deals with methods for sampling a mixed minefield when the Katz
components are all independent. One could, of course, sample n Katz random
variables, but it will generally be more efficient to sample M (the total), and then divide
it into n parts. Theorem 5 relies on the fact that a certain distribution is of the MCK type,
as explained below.

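If $M \sim K(\alpha, \beta)$, then a closed-form expression for the probability mass function of $M$,
valid if $\beta < 0$ or $\beta > 0$, is

$$
P(M = m) = (1 - \beta)^{\alpha/\beta}\,(-\beta)^m\,\frac{(-\alpha/\beta)_m}{m!}; \quad m \ge 0.
\tag{A16}
$$

The notation $(x)_m$ is taken from Feller (1957), where $(x)_m$ is defined to be
$x(x - 1) \cdots (x - m + 1)$ for $m \ge 1$, with $(x)_0 \equiv 1$ ($m$ is a nonnegative integer, but $x$ can be
any real number). The limit as $\beta \to 0$ produces a Poisson distribution, so in that sense
(A16) is valid for all $(\alpha, \beta)$ satisfying (2). If $M_i \sim K(\alpha_i, \beta)$, and if $M_1, \ldots, M_n$ are all
independent, then $M \sim K(\alpha, \beta)$ according to Theorem 4. Let $\mathbf{M} \equiv (M_1, \ldots, M_n)$, and
$\mathbf{m} \equiv (m_1, \ldots, m_n)$. Then

$$
P(\mathbf{M} = \mathbf{m} \mid M = m) = \frac{\prod_{i=1}^{n} P(M_i = m_i)}{P(M = m)}.
\tag{A17}
$$

All of the factors involving $(1 - \beta)$ and $(-\beta)$ raised to powers cancel in (A17), leaving

$$
P(\mathbf{M} = \mathbf{m} \mid M = m) = \frac{\prod_{i=1}^{n} \left[ (-\alpha_i/\beta)_{m_i} / m_i! \right]}{(-\alpha/\beta)_m / m!}; \quad m_i \ge 0,\ m = m_1 + \cdots + m_n.
\tag{A18}
$$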

The distribution (A18) will be referred to as a “multivariate conditional Katz distribution
with parameters $\boldsymbol{\alpha}$, $\beta$, $m$ and $n$”, or MCK for short. The MCK distribution is a
multivariate hypergeometric distribution when $\beta < 0$, or a multinomial distribution in the
limit as $\beta \to 0$ (Johnson and Kotz, op. cit., p. 281). When $\beta > 0$, the MCK distribution
has been called a multivariate Polya-Eggenberger distribution (Johnson and Kotz, 1977)
on account of its relationship to certain urn-sampling schemes, or the multivariate Polya
distribution (Janardin and Patil, 1970). Thinking of $M_i$ as the number of balls in an urn
leads to a practical way of generating $\mathbf{M}$ in a Monte Carlo simulation, since only a single
Katz sample of the total $M$ is really required. This is the gist of Theorem 5.
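
The conditional distribution can be evaluated directly as the ratio in (A17). The sketch below is an illustration with hypothetical function names; it assumes that (A16) reads $P(M = m) = (1 - \beta)^{\alpha/\beta}(-\beta)^m(-\alpha/\beta)_m/m!$ with $(x)_m$ the falling factorial, and it is only suitable for small $m$ because the falling factorial is evaluated naively.

```python
from itertools import product
from math import factorial


def falling(x, m):
    """Falling factorial (x)_m = x (x - 1) ... (x - m + 1), with (x)_0 = 1."""
    out = 1.0
    for i in range(m):
        out *= x - i
    return out


def katz_pmf_closed(alpha, beta, m):
    """Closed-form Katz probability, assuming the form of (A16) stated above; beta != 0."""
    return (1.0 - beta) ** (alpha / beta) * (-beta) ** m * falling(-alpha / beta, m) / factorial(m)


def mck_pmf(ms, alphas, beta):
    """P(M_1 = m_1, ..., M_n = m_n | M = sum(ms)) as the ratio (A17)."""
    numerator = 1.0
    for m_i, a_i in zip(ms, alphas):
        numerator *= katz_pmf_closed(a_i, beta, m_i)
    return numerator / katz_pmf_closed(sum(alphas), beta, sum(ms))


def compositions(m, n):
    """All n-tuples of nonnegative integers that sum to m."""
    return [ms for ms in product(range(m + 1), repeat=n) if sum(ms) == m]
```

Summing `mck_pmf` over `compositions(m, n)` should give 1 for any admissible parameters, which is a quick sanity check on both the closed form and the cancellation that leads to (A18).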


Theorem 5: Let $M \sim K(\alpha, \beta)$, where $\alpha = \alpha_1 + \cdots + \alpha_n$ and $\alpha_i > 0$ for $1 \le i \le n$.
The pair $(\alpha_i, \beta)$ is assumed to satisfy (2) for $1 \le i \le n$. Consider the following
procedure for placing $M$ balls in $n$ urns. For $k = 0, \ldots, M - 1$, the $(k + 1)$st ball is placed in
urn $i$ with probability $p_i$, where

$$
p_i = \frac{\alpha_i + \beta k_i}{\alpha + \beta k},
\tag{A19}
$$

and where $k_i$ is the number of balls already in urn $i$. If $M_i$ is the number of balls finally
placed in urn $i$, then $M_i \sim K(\alpha_i, \beta)$, and all of the $M_i$ are independent of each other,
$i = 1, \ldots, n$. □
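
The urn procedure of Theorem 5 translates into a short Monte Carlo routine. The sketch below is one possible implementation, not the one used in MIXER or Javamix: the function names are hypothetical, the total is drawn by inverse-transform sampling from the Katz recurrence (an assumed sampling method), and the allocation follows (A19).

```python
import random
from math import exp


def sample_katz(alpha, beta, rng):
    """Draw M ~ K(alpha, beta) by inverse transform on the recurrence (assumed method)."""
    x0 = exp(-alpha) if beta == 0 else (1.0 - beta) ** (alpha / beta)
    u = rng.random()
    j, prob, cdf = 0, x0, x0
    while u > cdf:
        prob *= (alpha + beta * j) / (j + 1)
        if prob <= 0.0:  # end of the support (beta < 0) or underflow in the far tail
            break
        j += 1
        cdf += prob
    return j


def allocate_urns(total, alphas, beta, rng):
    """Place `total` balls in len(alphas) urns, choosing urn i with probability (A19)."""
    counts = [0] * len(alphas)
    for _ in range(total):
        # Unnormalized weights alpha_i + beta * k_i; they sum to alpha + beta * k.
        # The clamp guards against tiny negative values from rounding when beta < 0.
        weights = [max(a + beta * c, 0.0) for a, c in zip(alphas, counts)]
        i = rng.choices(range(len(alphas)), weights=weights)[0]
        counts[i] += 1
    return counts


def sample_mixed_minefield(alphas, beta, rng):
    """One Katz draw for the total, then the urn split into the individual mine types."""
    total = sample_katz(sum(alphas), beta, rng)
    return allocate_urns(total, alphas, beta, rng)
```

A call such as `sample_mixed_minefield([2.0, 1.0, 1.0], 0.6, random.Random(1))` returns a list of three nonnegative counts. Only one Katz sample is drawn per replication, at the cost of one weighted choice per mine.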


Proof: Let $\mathbf{M} \equiv (M_1, \ldots, M_n)$, and $\mathbf{m} \equiv (m_1, \ldots, m_n)$. It will be shown by
induction that $P(\mathbf{M} = \mathbf{m} \mid M = m)$ is given by (A18) for $m \ge 0$. Since (A18) is equivalent to
(A17), the theorem follows upon removing the condition on $M$.

Let $Q(\mathbf{m})$ be $P(\mathbf{M} = \mathbf{m} \mid M = m_1 + \cdots + m_n)$, and note that $Q(\mathbf{0}) = 1$, a special case of
(A18) where $m_1 + \cdots + m_n = 0$. Suppose $Q(\mathbf{m})$ is given by (A18) for all $\mathbf{m}$ such that
$m_1 + \cdots + m_n = k$; let $\mathbf{e}_i$ be an $n$-vector all of whose components are zero except for
component $i$, which is 1; and let $k_i = m_i - 1$, $i = 1, \ldots, n$ (if $k_i < 0$, then the corresponding
term may be omitted from (A20) below). Then, conditioning on the ball configuration
after $k$ balls have been placed,

$$
Q(\mathbf{m}) = \sum_{i=1}^{n} Q(\mathbf{m} - \mathbf{e}_i)\,\frac{\alpha_i + \beta k_i}{\alpha + \beta k},
\tag{A20}
$$

where $\mathbf{m}$ is now any configuration such that $\sum_{i=1}^{n} m_i = k + 1$. $Q(\mathbf{m} - \mathbf{e}_i)$ on the right hand
side of (A20) is by assumption given by (A18), and it is now only a matter of some
algebra to conclude that $Q(\mathbf{m})$ on the left hand side is also given by (A18). Since $\mathbf{m}$ is
arbitrary except for its sum, this completes the inductive proof. □
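
The algebra behind the inductive step can be spot-checked numerically. The sketch below evaluates the recursion (A20) exactly and compares it with a closed form; it assumes that (A18) reads $m!\prod_i[(-\alpha_i/\beta)_{m_i}/m_i!]/(-\alpha/\beta)_m$, and the function names are hypothetical.

```python
from itertools import product
from math import factorial


def falling(x, m):
    """Falling factorial (x)_m = x (x - 1) ... (x - m + 1), with (x)_0 = 1."""
    out = 1.0
    for i in range(m):
        out *= x - i
    return out


def mck_formula(ms, alphas, beta):
    """Closed form assumed for (A18); requires beta != 0."""
    m = sum(ms)
    value = factorial(m) / falling(-sum(alphas) / beta, m)
    for m_i, a_i in zip(ms, alphas):
        value *= falling(-a_i / beta, m_i) / factorial(m_i)
    return value


def urn_recursion(ms, alphas, beta, cache=None):
    """Q(m) from the recursion (A20), starting from Q(0) = 1."""
    if cache is None:
        cache = {}
    ms = tuple(ms)
    if sum(ms) == 0:
        return 1.0
    if ms not in cache:
        k = sum(ms) - 1  # number of balls placed before the last one
        total = 0.0
        for i, m_i in enumerate(ms):
            if m_i == 0:
                continue  # k_i < 0: the corresponding term is omitted
            prev = ms[:i] + (m_i - 1,) + ms[i + 1:]
            step = (alphas[i] + beta * (m_i - 1)) / (sum(alphas) + beta * k)
            total += urn_recursion(prev, alphas, beta, cache) * step
        cache[ms] = total
    return cache[ms]
```

For small totals, `urn_recursion` and `mck_formula` should agree to rounding error on every configuration, which is what the inductive proof asserts in general.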

Comment: When $\beta = 0$, equation (A19) makes $p_i = \alpha_i/\alpha$ for every ball. The fact that a
Poisson random variable produces independent Poisson parts when partitioned in this
manner is well known (e.g. Ross (1993)). When $\beta \ne 0$, if each ball is placed in urn $i$ with
probability $\alpha_i/\alpha$, instead of according to (A19), then by Theorem 3 $M_i \sim K(\alpha_i', \beta_i')$,
where $\alpha_i' = \alpha_i/(1 - \beta(1 - \alpha_i/\alpha))$ and $\beta_i' = \beta(\alpha_i/\alpha)/(1 - \beta(1 - \alpha_i/\alpha))$. $E(M_i)$ is still
$\alpha_i/(1 - \beta)$, but it is not true that $M_i \sim K(\alpha_i, \beta)$, and furthermore $M_1, \ldots, M_n$ are not
mutually independent. These latter properties require that the balls be allocated according to
(A19).